Notes (Week 8 Thursday)
These are the notes I presented in today's lecture, together with additions and clarifications made afterwards, including further answers to some questions asked in class. Many thanks to Craig McLaughlin and Johannes Åman Pohjola for the original notes.
1. Notes
Most General Type
-----------------

Example: What's the most general type of  recfun f x = fst x + 1  ?

    f : ∀α. α → α           -- No....why not?
    f : ∀α. (Int, α) → Int  -- This one.

Takeaway: Most general = least commitment, most widely applicable.

-> Why can't we have that for recursive types?

Consider these two recursive types:

    List = type a. rec t. () + (a, t)
    Tree = type a. rec t. () + (a, t, t)

    roll(inl ()) : List@Bool ?
    roll(inl ()) : Tree@Int ?

We just don't know.

Takeaway: Values of recursive types will not have a unique most general type.

Note: In Haskell these types are

    data List a = Nil | Cons a (List a)
    data Tree a = EmptyTree | Node a (Tree a) (Tree a)

i.e. Haskell requires unique constructor names, so it doesn't end up with
this ambiguity from just using `roll`.

-> Don't sums have the same problem? Consider also

    inl() : () + Int
    inl() : () + Bool

Unlike the recursive types, these *do* have a unique most general type:

    inl() : () + a

Why Func adds to the type env
-----------------------------

Consider this attempt to infer the type of a recfun if the Func rule
doesn't add f, x to the typing environment Γ:

    __________________    ___________
    Γ ⊢ f : Int -> Int    Γ ⊢ x : Int
    _________________________________    _________
            Γ ⊢ f(x) : Int                1 : Int
            _____________________________________
                    Γ ⊢ f(x) + 1 : Int
    ______________________________________ Func?
    Γ ⊢ Recfun f x = f(x) + 1 : Int -> Int

The problem is that Γ doesn't tell us anything about f and x, which are
local to the recfun. But with f and x added to the environment by Func as
follows, we are then able to prove the upper-left parts of the above
derivation:

                    etc. etc.
    ___________________________________________
    f : Int -> Int, x : Int, Γ ⊢ f(x) + 1 : Int
    ___________________________________________ Func
      Γ ⊢ Recfun f x = f(x) + 1 : Int -> Int

Generalising
------------

    Γ ⊢ e : τ    α ∉ FTV(Γ)
    ─────────────────────── AllI
    Γ ⊢ e : ∀α.τ

The example given on the slide was:

    let f = (recfun f x = (x, x))
    in (fst (f 4), fst (f True))

with the prompt "Where should generalisation happen?"
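The Haskell versions of the two types can be written out directly; because every value names its constructor, the empty list and the empty tree have distinct, unambiguous most general types. A small sketch (the `lenL` helper is mine, just for illustration):

```haskell
-- The two recursive types from above, in Haskell. Constructor names
-- make each value's type unambiguous: Nil can only be a List, and
-- EmptyTree can only be a Tree.
data List a = Nil | Cons a (List a)
data Tree a = EmptyTree | Node a (Tree a) (Tree a)

-- Nil has a unique most general type: List a.
emptyList :: List a
emptyList = Nil

-- An illustrative helper showing List in use.
lenL :: List a -> Int
lenL Nil        = 0
lenL (Cons _ t) = 1 + lenL t
```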
The problem is that this rule is applicable to every possible expression -
we could apply it when e is any of the following expressions, as long as
we pick an unused type variable α ∉ FTV(Γ) to generalise over:

    e = (let f = (recfun f x = (x, x)) in (fst (f 4), fst (f True)))
    e = (recfun f x = (x, x))
    ...

Making the generalisation rule applicable only to let-expressions (or the
top level of the entire program) means there is no ambiguity about when to
apply it, resulting in the kind of deterministic behaviour we want in a
type inference algorithm.

Type Inference Algorithm Preliminaries
--------------------------------------

We'll extend our grammar for typing contexts to include type variable
declarations and definitions, and **markers** (important later):

    typing contexts Γ ::= . | Γ,x:σ | Γ,α | Γ,α:=τ | Γ #

Types now contain type variables, which may be "free" or "bound" just
like term variables in expressions:

    R α -- rigid type variables    -- bound by forall quantifiers
    F α -- flexible type variables -- introduced during type inference

    types τ ::= F α | R α | Bool | Int | τ → τ

Forall-quantified types are called type *schemes*:

    type schemes σ ::= ∀[α].τ

A minimal calculus for now, just lambdas and let-expressions:

    expressions e ::= x | λx.e | e e | let x = e in e

Unification
-----------

Our rules for unification are specified by the judgement:

    Γ₁ ⊢ τ₁ ~ τ₂ ⟹ Γ₂

which is defined such that:

1. Γ₁ and Γ₂ contain the same type variables;

2. Γ₂ is **more informative** than Γ₁, in the sense that declared type
   variables have been given definitions in order for τ₁ ~ τ₂ to hold:
   Γ₁ ⊑ Γ₂

3.
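The slide's example can be run directly in Haskell, where generalisation does happen at the let: f gets the scheme ∀a. a → (a, a) and can then be used at both Int and Bool (the names here are mine):

```haskell
-- Generalisation at the let: f :: a -> (a, a), usable at two types.
example :: (Int, Bool)
example =
  let f x = (x, x)                       -- f generalised here
  in (fst (f (4 :: Int)), fst (f True))  -- used at Int and at Bool
```

If generalisation could happen anywhere, an inference algorithm would face a nondeterministic choice at every subexpression; pinning it to let-bindings removes that choice.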
   The information increase is **minimal** (most general), in the sense
   that it makes the least commitment needed to solve the equation: any
   other solution Γ₁ ⊑ Γ' factors through Γ₁ ⊑ Γ₂.

Worked Unification Example
--------------------------

Example unification problem:

    α, β, ρ:=β, γ ⊢ (α → β) ~ (ρ → (γ → γ)) ⟹ updated context

The idea is to update the context with definitions for the unknowns --
the flexible/unification type variables -- by decomposing the
unification problem. Decomposing the function type, the above becomes
two unification problems:

    1. α, β, ρ:=β, γ ⊢ F α ~ F ρ    ⟹ Γ₁
    2. Γ₁            ⊢ β ~ (γ → γ)  ⟹ Γ₂

for some Γ₁, Γ₂. Notice that the input to problem 2 is the output from
problem 1.

To solve a unification problem, we move through the context from right
to left, looking to simplify either side of the equation by definition:

1.  α, β, ρ:=β, γ ⊢ F α ~ F ρ

    γ does not occur in the problem, so we can skip it! What we'll use
    to take these next steps are "unification" rules whose conclusions
    have the form:

        Γ ⊢ F α ~ F ρ ⟹ Γ'

    For this step, we use rule Skip-Type (see further below).

    ⟹ α, β, ρ:=β ⊢ F α ~ F ρ , γ

    ρ occurs on the right-hand side and has a definition: substitute!

    ⟹ α, β ⊢ F α ~ F β , ρ:=β, γ

    β occurs on the right-hand side: define!

    ⟹ α, β:=α, ρ:=β, γ

    Done.

2.  α, β:=α, ρ:=β, γ ⊢ β ~ (γ → γ)

    γ occurs in the type on the right-hand side: a dependency!
    Dependencies are tracked in a new typing context Δ that sits to the
    right of the original one Γ, separated by a bar symbol, as follows:

        Γ | Δ ⊢ foo ~ goo

    To tackle unification problems of this new form, we use
    "instantiation" rules whose conclusions have that form.

    ⟹ α, β:=α, ρ:=β | γ ⊢ β ~ (γ → γ)

    ρ does not occur: skip!

    ⟹ α, β:=α | γ ⊢ β ~ (γ → γ) , ρ:=β

    β occurs on the left-hand side: substitute its definition!

    ⟹ α | γ ⊢ α ~ (γ → γ) , β:=α, ρ:=β

    α occurs on the left-hand side: define, placing all its dependencies
    (γ) to the left:

    ⟹ γ, α:=γ → γ, β:=α, ρ:=β

    Done.
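The "substitute!" steps above replace a defined variable by its definition inside a type. A minimal Haskell sketch of that one operation, with invented names (`Ty`, `subst`) and ASCII variable names standing in for α, β, γ, ρ:

```haskell
-- Types as in the notes: flexible vars, rigid vars, Int, Bool, arrows.
data Ty = Flex String | Rigid String | TInt | TBool | Arr Ty Ty
  deriving (Eq, Show)

-- subst a t u computes [a := t] u, replacing the flexible variable a.
subst :: String -> Ty -> Ty -> Ty
subst a t (Flex b) | a == b = t
subst a t (Arr u v)         = Arr (subst a t u) (subst a t v)
subst _ _ u                 = u

-- The step "rho has a definition rho := beta, substitute!" turns
-- F alpha ~ F rho into F alpha ~ F beta:
--   subst "p" (Flex "b") (Flex "p")  ==  Flex "b"
```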
Unification rule design
-----------------------

Recall the unification judgement:

    Γ₁ ⊢ τ₁ ~ τ₂ ⟹ Γ₂

We define it by induction on the structure of types:

    types τ ::= F α | R α | Bool | Int | τ → τ

For rigid type variables, the rules are straightforward:

    α = β
    ────────────────── Unify-R-Eq
    Γ ⊢ R α ~ R β ⟹ Γ

and similarly for Int and Bool. Note, α = β means the string identifiers
for type variables α and β are literally identical (and α != β means
they are not). The arrow case is similar, just recursing structurally:

    Γ₁ ⊢ τ₁ ~ ν₁ ⟹ Γ₂    Γ₂ ⊢ τ₂ ~ ν₂ ⟹ Γ₃
    ─────────────────────────────────────── Unify-Arrow
    Γ₁ ⊢ (τ₁ → τ₂) ~ (ν₁ → ν₂) ⟹ Γ₃

The interesting cases are the flex-flex and flex-rigid constraints.

First: unification of a flexible variable with a flexible variable.

    typing contexts Γ ::= . | Γ,x:σ | Γ,α | Γ,α:=τ | Γ #

Intuition: the further right, the more local; the further left, the more
global. Things in the environment Γ can depend on things to their left,
but not on things to their right.

A trivial rule:

    ────────────────── Refl
    Γ ⊢ F α ~ F α ⟹ Γ

What do we do here?

    α != β
    ──────────────────
    Γ ⊢ F α ~ F β ⟹ Γ'

Consider the cases...

    α != β    ρ != α    ρ != β    Γ ⊢ F α ~ F β ⟹ Γ'
    ───────────────────────────────────────────────── Skip-Type
    Γ,ρ ⊢ F α ~ F β ⟹ Γ',ρ

    α != β
    ──────────────────────── Defn
    Γ,α ⊢ F α ~ F β ⟹ Γ,α:=β

(Note, we also have the symmetric version of this Defn rule - that's
what's applied to solve part 1 of the unification example further up.)

    α != β    Γ ⊢ [ρ:=τ](F α) ~ [ρ:=τ](F β) ⟹ Γ'
    ───────────────────────────────────────────── Subst
    Γ,ρ:=τ ⊢ F α ~ F β ⟹ Γ',ρ:=τ

where

    [α:=τ](ν → ν') = ([α:=τ]ν) → ([α:=τ]ν')
    [α:=τ](R x)    = R x
    [α:=τ](F α)    = τ
    [α:=τ](F β)    = F β    (β != α)

    Γ ⊢ F α ~ F β ⟹ Γ'
    ─────────────────────────── Skip-Term
    Γ,x:σ ⊢ F α ~ F β ⟹ Γ',x:σ

Second: unification of a flexible variable with a type that's not a
flexible variable:

    τ not a flexvar    Γ | [] ⊢ F α ~ τ ⟹ Γ'
    ───────────────────────────────────────── Inst
    Γ ⊢ F α ~ τ ⟹ Γ'
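The behaviour of these rules can be approximated in a few lines of Haskell. This sketch uses the more familiar substitution-map formulation rather than the ordered contexts of the notes, so it loses the dependency-ordering bookkeeping, but the Refl, Defn, Subst and arrow cases behave the same way. All names (`Ty`, `Subst`, `unify`, ...) are invented for illustration:

```haskell
import qualified Data.Map as M

-- Types as in the notes; rigid variables unify only with themselves.
data Ty = Flex String | Rigid String | TInt | TBool | Arr Ty Ty
  deriving (Eq, Show)

-- Definitions of flexible variables, standing in for the
-- context's  alpha := tau  entries.
type Subst = M.Map String Ty

-- Chase definitions, like the Subst rules walking through the context.
walk :: Subst -> Ty -> Ty
walk s (Flex a)  = maybe (Flex a) (walk s) (M.lookup a s)
walk s (Arr t u) = Arr (walk s t) (walk s u)
walk _ t         = t

occurs :: String -> Ty -> Bool
occurs a (Flex b)  = a == b
occurs a (Arr t u) = occurs a t || occurs a u
occurs _ _         = False

-- Nothing signals failure (a clash, or an occurs check failure).
unify :: Subst -> Ty -> Ty -> Maybe Subst
unify s t0 u0 = go (walk s t0) (walk s u0)
  where
    go (Flex a) (Flex b) | a == b = Just s     -- Refl
    go (Flex a) v
      | occurs a v = Nothing                   -- occurs check failure
      | otherwise  = Just (M.insert a v s)     -- Defn
    go v (Flex a) = go (Flex a) v              -- symmetric Defn
    go (Rigid a) (Rigid b) | a == b = Just s   -- Unify-R-Eq
    go TInt  TInt  = Just s
    go TBool TBool = Just s
    go (Arr t1 t2) (Arr u1 u2) = do            -- Unify-Arrow
      s' <- unify s t1 u1
      unify s' t2 u2
    go _ _ = Nothing
```

Running it on the worked example (starting from ρ := β, unifying α → β with ρ → (γ → γ)) ends with α standing for γ → γ, agreeing with the context computed above.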
Now we need to apply *instantiation* rules that try to unify the type
variable α with any type τ that is not a flexible type variable. These
rules carry around a new context Δ recording type variable dependencies
of τ that will later need to be put to the left of α.

The following are the instantiation rules. First, the free flexible type
variables of a type:

    FTV(F α)    = {α}
    FTV(τ → τ') = FTV(τ) ∪ FTV(τ')
    ...

    ρ != α    ρ ∈ FTV(τ)    Γ | Δ,ρ ⊢ F α ~ τ ⟹ Γ'
    ─────────────────────────────────────────────── Depend
    Γ,ρ | Δ ⊢ F α ~ τ ⟹ Γ'

Note, the next rule's hypothesis goes back to being a unification
problem:

    Γ,Δ ⊢ [ρ:=τ'](F α) ~ [ρ:=τ']τ ⟹ Γ'
    ──────────────────────────────── Subst
    Γ,ρ:=τ' | Δ ⊢ F α ~ τ ⟹ Γ',ρ:=τ'

This rule is a base case that finally unifies α with a suitable τ. Note
it has an *occurs check*:

    α ∉ FTV(τ)
    ──────────────────────────── Defn
    Γ,α | Δ ⊢ F α ~ τ ⟹ Γ,Δ,α:=τ

What if α ∈ FTV(τ)? Then we have an equation like

    F α ~ (F α → F α)

which is an occurs check failure.

Questions from the class
------------------------

Q: Is the reason we exclude recursive types to avoid cases like this?

    F α ~ (F α → F α)

A: Apart from avoiding the problem I mentioned where `rec t. blah` leads
to ambiguity between types from not having uniquely named constructors,
I agree that avoiding this case is another benefit.

Q: What if we had a type like this?

    MinHS:   rec t. (t -> Int)
    Haskell: data Blah = MyBlah (Blah -> Int)

A: I believe MinHS and Haskell will both allow you to declare types like
this, but then neither of these languages prevents you from having
non-terminating functions. The Agda documentation has a nice example
explaining why Agda requires all self-reference to be in *strictly
positive* position, which means never on the left side of an arrow like
t is above:

    https://agda.readthedocs.io/en/latest/language/data-types.html#strict-positivity

In short, admitting types with self-reference in a negative position can
cause functions on those types to be non-terminating.
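The occurs check at the heart of the Defn instantiation rule is just a membership test on FTV. A tiny Haskell sketch (names invented; ASCII `a` stands in for α):

```haskell
data Ty = Flex String | Rigid String | TInt | TBool | Arr Ty Ty

-- Free flexible type variables, as defined above.
ftv :: Ty -> [String]
ftv (Flex a)  = [a]
ftv (Arr t u) = ftv t ++ ftv u
ftv _         = []

-- The Defn rule fires only when this returns False.
occursIn :: String -> Ty -> Bool
occursIn a t = a `elem` ftv t
```

Here `occursIn "a" (Arr (Flex "a") (Flex "a"))` is True, so the equation F α ~ (F α → F α) is rejected rather than solved with an infinite type.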
Agda and Isabelle (both interactive theorem provers) require all
functions to be terminating. In Isabelle this is to avoid the
possibility of deriving contradictions, and I'd assume that's why Agda
does it too. (See also https://cgi.cse.unsw.edu.au/~cs4161/week05B.pdf
slide 8 - and if you're curious, the point about violating Cantor's
theorem is elaborated in
https://cgi.cse.unsw.edu.au/~cs4161/week05B_demo.thy)

Q: Is the problem with this type that you'd never be able to instantiate
it?

A: I see that as one of the problems - but I think the main reason for
it is that the type doesn't have a base case, only an inductive one. The
following types have a base case (and are therefore instantiable) but
can still lead to non-terminating functions, because their
self-reference is in a negative position:

    MinHS:   rec t. (t -> Int) + 1
    Haskell: data Blah = MyBlah (Blah -> Int) | EmptyBlah
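To make the last point concrete, here is the Haskell version with its base case, plus a function exploiting the negative self-reference: it typechecks, but forcing `omega` would loop forever (the function names are mine):

```haskell
data Blah = MyBlah (Blah -> Int) | EmptyBlah

-- Negative self-reference lets a value be applied to itself.
selfApply :: Blah -> Int
selfApply EmptyBlah  = 0
selfApply (MyBlah f) = f (MyBlah f)

-- Well-typed, but evaluating it never terminates:
--   selfApply (MyBlah selfApply)
--     -> selfApply (MyBlah selfApply) -> ...
omega :: Int
omega = selfApply (MyBlah selfApply)
```

Thanks to laziness, merely defining `omega` is harmless; only demanding its value diverges, which is exactly the behaviour Agda's strict positivity restriction rules out.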